The emergence of deep neural networks (DNNs) has once again drawn enormous attention to artificial neural networks (ANNs). They have become the state-of-the-art models and have won a range of machine learning challenges. Although these networks are inspired by the brain, they lack biological plausibility and differ structurally from it. Spiking neural networks (SNNs) have been around for a long time and have been investigated as a way to understand the dynamics of the brain. However, their application to complicated real-world machine learning tasks was limited. Recently, they have shown great potential in solving such tasks. Their energy efficiency and temporal dynamics make their future development promising. In this work, we review the structures and performance of SNNs on image classification tasks. The comparisons illustrate that these networks show great capabilities for more complicated problems. Furthermore, the simple learning rules developed for SNNs, such as STDP and R-STDP, are a potential alternative to the backpropagation algorithm used in DNNs.
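As an illustration of the kind of simple learning rule mentioned above, the following is a minimal sketch of a pair-based STDP weight update. It is not taken from the reviewed work: the function names, the amplitudes `a_plus` and `a_minus`, the time constants, and the weight bounds are all assumed values chosen for illustration.

```python
import math


def stdp_delta_w(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair (hypothetical parameters).

    dt = t_post - t_pre in milliseconds.
    """
    if dt > 0:
        # Pre fires before post: potentiation, decaying with the time gap.
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        # Post fires before pre: depression, decaying with the time gap.
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


def apply_stdp(w, dt, w_min=0.0, w_max=1.0):
    """Apply the update and clip the weight to assumed bounds [w_min, w_max]."""
    return min(w_max, max(w_min, w + stdp_delta_w(dt)))
```

The update depends only on the relative timing of two spikes at one synapse, which is why such rules are local and need no backward pass through the network.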
Semi-supervised learning (SSL) has made significant strides in the field of remote sensing. Large labeled datasets for SSL methods are uncommon, and manually labeling datasets is expensive and time-consuming. Furthermore, accurately identifying remote sensing satellite images is more complicated than it is for conventional images. Class-imbalanced datasets are another prevalent phenomenon, and models trained on them become biased towards the majority classes. This bias becomes a critical issue because it degrades an SSL model's performance. We aim to address the issue of labeling unlabeled data and to solve the model bias problem caused by imbalanced datasets while achieving better accuracy. To accomplish this, we create "artificial" labels and train a model to reach reasonable accuracy. We iteratively redistribute the classes through resampling using a distribution alignment technique. We use a variety of class-imbalanced satellite image datasets: EuroSAT, UCM, and WHU-RS19. On the balanced UCM dataset, our method outperforms the previous methods MSMatch and FixMatch by 1.21% and 0.6%, respectively. On imbalanced EuroSAT, our method outperforms MSMatch and FixMatch by 1.08% and 1%, respectively. Our approach significantly lessens the requirement for labeled data, consistently outperforms alternative approaches, and resolves the issue of model bias caused by class imbalance in datasets.
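The abstract does not spell out how the artificial labels and the distribution alignment are computed. The sketch below shows one common formulation, given here as an assumption rather than as the authors' method: predicted class probabilities are rescaled by the ratio of a target class distribution to the model's running average prediction, renormalized, and then turned into a pseudo-label only if the top probability clears a confidence threshold. The function names and the 0.95 threshold are hypothetical.

```python
def align_distribution(probs, model_dist, target_dist):
    """Rescale predictions so their average moves toward target_dist.

    probs: predicted class probabilities for one unlabeled sample.
    model_dist: running average of the model's predictions (assumed given).
    target_dist: desired class distribution (assumed given).
    """
    scaled = [p * t / max(m, 1e-8) for p, m, t in zip(probs, model_dist, target_dist)]
    total = sum(scaled)
    return [s / total for s in scaled]


def pseudo_label(probs, threshold=0.95):
    """Return the argmax class if it is confident enough, else None."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else None
```

For example, if the model over-predicts class 0 (`model_dist = [0.8, 0.2]`) while the target is uniform, a prediction of `[0.6, 0.4]` is shifted toward class 1 after alignment.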
To ensure proper knowledge representation of the kitchen environment, it is vital for kitchen robots to recognize the states of the food items being cooked. Although the domain of object detection and recognition has been extensively studied, the task of object state classification has remained relatively unexplored. The high intra-class similarity of ingredients across different cooking states makes the task even more challenging. Researchers have recently proposed deep learning based strategies, but these have yet to achieve high performance. In this study, we utilize the self-attention mechanism of the Vision Transformer (ViT) architecture for the Cooking State Recognition task. The proposed approach captures the globally salient features of images while also exploiting the weights learned from a larger dataset. This global attention allows the model to withstand the similarities between samples of different cooking objects, while transfer learning helps to overcome the lack of inductive bias by utilizing pretrained weights. To improve recognition accuracy, several augmentation techniques are employed as well. Evaluated on the "Cooking State Recognition Challenge Dataset", our proposed framework achieves an accuracy of 94.3%, which significantly outperforms the state of the art.
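To make the global attention idea concrete, here is a minimal sketch of scaled dot-product self-attention over a list of token vectors. It is a simplification for illustration only: a real ViT first applies learned linear projections to obtain queries, keys, and values and uses multiple heads, all of which are omitted here, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Each output row is a weighted average of all value vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dimension).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs
```

Because every token attends to every other token, each output mixes information from the whole image, which is the sense in which the attention is global.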
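Domain adaptation of large neural language models (NLMs) is usually coupled with massive amounts of unstructured data in the pretraining phase. In this study, however, we show that pretrained NLMs learn in-domain information more effectively and faster from a compact subset of the data that focuses on the key information in the domain. We construct these compact subsets from the unstructured data using a combination of abstractive summaries and extractive keywords. In particular, we rely on BART to generate abstractive summaries and on KeyBERT to extract keywords from these summaries (or directly from the original unstructured text). We evaluate our approach in six different settings: three datasets combined with two distinct NLMs. Our results show that task-specific classifiers trained on top of NLMs pretrained with our method outperform both methods based on traditional pretraining, i.e., random masking over the entire data, and methods without pretraining. Furthermore, we show that our strategy reduces pretraining time by up to a factor of five compared to vanilla pretraining. The code for all of our experiments is publicly available at https://github.com/shahriargolchin/compact-pretraining.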
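Learning from imbalanced data is a challenging task. Standard classification algorithms tend to perform poorly when trained on imbalanced data. Special strategies are needed, either modifying the data distribution or redesigning the underlying classification algorithm, to achieve the desired performance. The prevalence of imbalance in real-world datasets has led to a multitude of strategies for the class imbalance problem. However, not all strategies are useful or perform well across different imbalance scenarios. There are numerous approaches to handling imbalanced data, but the efficacy of these techniques has not been evaluated, nor have they been compared experimentally. In this study, we present a comprehensive analysis of 26 popular sampling techniques to understand their effectiveness in dealing with imbalanced data. Rigorous experiments were conducted on 50 datasets with different degrees of imbalance to thoroughly investigate the performance of these techniques. We present a detailed discussion of the advantages and limitations of the techniques, as well as how to overcome those limitations. We identify some critical factors that affect sampling strategies and provide recommendations on how to choose an appropriate sampling technique for a particular application.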
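For readers unfamiliar with sampling techniques, the sketch below shows random oversampling, one of the simplest ways to modify the data distribution. It is given only as an illustrative baseline; the abstract does not list which 26 techniques were analyzed, and the function name and seed parameter are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match
    the size of the largest class. Returns new sample and label lists."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        # Candidate samples to duplicate for this class.
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y
```

Oversampling of this kind should be applied to the training split only, since duplicating samples before splitting would leak copies into the evaluation data.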
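In recent years, unmanned aerial vehicles (UAVs) have gained significant traction in the context of surveillance. However, video datasets that capture violent and non-violent human activity from an aerial point of view are scarce. To address this, we propose a novel baseline simulator capable of generating sequences of photo-realistic synthetic images of crowds engaged in various activities that can be categorized as violent or non-violent. The crowd groups are annotated with bounding boxes that are computed automatically using semantic segmentation. Our simulator can generate large, randomized urban environments and maintains an average of 25 frames per second on a mid-range computer with 150 concurrent crowd agents interacting with each other. We also show that when real-world data is augmented with synthetic data from the proposed simulator, binary video classification accuracy improves by 5% on average.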
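Deriving a bounding box from a semantic segmentation mask can be sketched as follows. This is a minimal illustration under the assumption that the mask is a 2D grid of integer labels with one label per crowd group; the simulator's actual representation is not described in the abstract, and the function name is hypothetical.

```python
def bounding_box(mask, label):
    """Return (x_min, y_min, x_max, y_max) of all pixels equal to `label`,
    or None if the label does not appear in the mask."""
    rows = [r for r, row in enumerate(mask) if label in row]
    cols = [c for row in mask for c, value in enumerate(row) if value == label]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


# Usage: label 1 occupies a small L-shaped region in a 4x4 mask.
example_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [2, 0, 0, 0],
]
box = bounding_box(example_mask, 1)
```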
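Walking is one of the most common modes of terrestrial locomotion for humans, and it is essential for most daily activities. When a person walks, the movement follows a pattern known as gait. Gait analysis is used in sports and healthcare. Gait can be analyzed in different ways, for example using video captured by surveillance cameras or depth cameras in a laboratory environment. It can also be recognized with wearable sensors, such as accelerometers, force sensors, gyroscopes, flexible goniometers, magnetoresistive sensors, electromagnetic tracking systems, and electromyography (EMG). Analysis with these sensors requires laboratory conditions, or else users must wear the sensors. To detect abnormalities in a person's gait, the sensors have to be incorporated separately. Once an abnormal gait has been detected, it can tell us about a person's health condition. Understanding regular versus abnormal gait with smart wearable technology may therefore give insight into the subject's health. In this paper, we propose a way to analyze abnormal human gait through smartphone sensors. Most people nowadays use smart devices such as smartphones and smartwatches, so the sensors of these smart wearable devices can be used to track their gait.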
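Walking motion planning based on the Divergent Component of Motion (DCM) and the Linear Inverted Pendulum Model (LIPM) is one of the alternatives for generating humanoid robot gait trajectories online. This algorithm requires several parameters to be tuned. Here, we develop a framework for obtaining optimal parameters that yield a stable and energy-efficient gait trajectory for a real robot. To find the optimal trajectory, we consider four cost functions representing energy consumption, the sum of joint velocities and applied torques at each lower-limb joint of the robot, and a cost function based on the Zero Moment Point (ZMP) stability criterion. A genetic algorithm is used in the framework to optimize each of these cost functions. Although trajectory planning is done with the simplified model, the value of each cost function is obtained by considering the full dynamics model and the foot-ground contact model in the Bullet physics engine simulator. The optimization shows that walking with the greatest stability and walking in the most efficient way conflict with each other. Therefore, in a further attempt, we perform multi-objective optimization of the ZMP and energy cost functions at three different speeds. Finally, we compare the designed trajectory generated with the optimal parameters against the results produced in simulation.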
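For reference, the standard LIPM and DCM relations that this kind of planner builds on can be written as below. The symbols are assumptions introduced here, not notation from the abstract: $x$ is the horizontal center-of-mass position, $p$ the ZMP, $z_c$ the constant center-of-mass height, $g$ gravitational acceleration, and $\xi$ the DCM.

```latex
\begin{align}
  \ddot{x} &= \omega^{2}\,(x - p), \qquad \omega = \sqrt{g / z_c} \\
  \xi &= x + \frac{\dot{x}}{\omega} \\
  \dot{x} &= \omega\,(\xi - x) \\
  \dot{\xi} &= \dot{x} + \frac{\ddot{x}}{\omega}
             = \omega\,(\xi - x) + \omega\,(x - p)
             = \omega\,(\xi - p)
\end{align}
```

The center of mass converges toward the DCM, while the DCM diverges away from the ZMP, so a planner only needs to control the unstable DCM dynamics through the ZMP.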
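Traffic sign detection is a vital task in the vision systems of self-driving cars and automated driving systems. Recently, novel Transformer-based models have achieved encouraging results on various computer vision tasks. We nevertheless observe that vanilla ViT cannot produce satisfactory results in traffic sign detection, because the overall size of the datasets is very small and the class distribution of traffic signs is extremely imbalanced. To overcome this problem, this paper proposes a novel Pyramid Transformer with locality mechanisms. Specifically, the Pyramid Transformer has several spatial pyramid reduction layers that shrink the input image and embed it into tokens with rich multi-scale context by using atrous convolutions. Moreover, it inherits an intrinsic scale-invariance inductive bias and is able to learn local feature representations of objects at various scales, thereby enhancing the network's robustness to the size variation of traffic signs. Experiments are conducted on the German Traffic Sign Detection Benchmark (GTSDB). The results demonstrate the superiority of the proposed model in the traffic sign detection task. More specifically, when applied to Cascade RCNN, the Pyramid Transformer achieves 75.6% mAP on GTSDB and surpasses most well-known and widely used state-of-the-art methods.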
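The atrous (dilated) convolution used to gather multi-scale context can be illustrated in one dimension. The sketch below is a simplified illustration and not the paper's layer: it uses no padding or stride, and, as is conventional in deep learning, it computes a cross-correlation (the kernel is not flipped). The function name is hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1D atrous convolution without padding.

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field grows to (len(kernel) - 1) * dilation + 1 with no extra weights.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(k * signal[i + j * dilation] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


# Usage: the same 3-tap kernel covers 3 samples at dilation 1
# and 5 samples at dilation 2.
dense = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1)
sparse = dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2)
```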
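People's personal hygiene habits say a great deal about how they care for their bodies and health in daily life. Maintaining good hygiene practices not only reduces the chance of contracting a disease but can also lower the risk of spreading illness within the community. Given the current pandemic, daily habits such as washing hands or showering regularly have become critically important, especially for elderly people living alone at home or in assisted living facilities. This paper presents a novel, non-invasive framework for monitoring human hygiene using vibration sensors, to which we apply machine learning techniques. The approach is based on a combination of a geophone sensor, a digitizer, and a cost-efficient computer board in a practical enclosure. Monitoring daily hygiene routines may help healthcare professionals be proactive rather than reactive in identifying and controlling the spread of potential outbreaks within the community. The experimental results indicate that a Support Vector Machine (SVM) used for binary classification achieves a promising accuracy of about 95% in classifying different hygiene habits. Furthermore, the tree-based classifiers (Random Forest and Decision Tree) outperform the other models by achieving the highest accuracy (100%), which means that hygiene events can be classified using non-invasive vibration sensors in order to monitor hygiene activity.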